G-EVAL EVALUATION SUMMARY
==================================================

Total interactions evaluated: 52
Average fidelity: 0.800
Average relevance: 0.800
Average context accuracy: 0.800
Average context recall: 0.800
Average overall score: 0.800